Number of Reads

Absolute numbers

Proportions

Transcriptome mapping

The plot shows the number of reads in each sample. Red shows the total number of reads, blue shows how many of those reads were uniquely mapped (that is, aligned to a single site in the genome), and green shows how many of the uniquely aligned reads fell within genes (exons). In the “Absolute numbers” plot, the absolute number of reads is shown.

In the “Proportions” plot the read counts are relative, with the total number of reads set to one. This makes it easier to see whether the different types of reads differ proportionally between samples. The mapping rate was unfortunately a bit lower than the 80-85% we usually expect from RNA-seq data. However, the number of reads mapped to genes is well over 10 million, so there shouldn’t be any problems with downstream differential expression analysis.
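As a minimal sketch of how the proportions are obtained, the three read counts of a sample can be divided by that sample's total. The function name and the example counts below are hypothetical and are not taken from this analysis.

```python
def read_proportions(total, unique, genic):
    """Scale the read counts of one sample so that the total equals 1."""
    if total <= 0:
        raise ValueError("total read count must be positive")
    return {
        "total": 1.0,
        "uniquely_mapped": unique / total,
        "in_genes": genic / total,
    }


# Hypothetical counts for a single sample (illustration only).
example = read_proportions(total=40_000_000, unique=30_000_000, genic=20_000_000)
print(example)
```

Because every sample is scaled to its own total, samples sequenced to different depths can be compared side by side.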

The “Transcriptome mapping” plot shows the mapping rate, in percent, when mapping against annotated transcripts with Salmon instead of against the full genome.


Samples Table

Table with the samples used for the analysis.

Comparisons Table

Comparisons that were performed to detect differentially regulated genes. Two approaches were used for the analysis: the generalized linear model approach (GLM), and the voom approach, where samples and genes are assigned weights and then modeled using linear models (Voom). Genes were also filtered to remove non-expressed and lowly expressed genes. Filtering kept genes that reached 1 count per million in 3 or more samples (cpm1_3).

The samples were split into negative and positive groups, and differential expression analysis was performed independently within each group. However, gene filtering was performed on all samples together, so the same genes are analyzed in both the positive and the negative samples, which makes the two sets of results easier to compare if needed.
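The cpm1_3 filter can be illustrated with a short sketch. It assumes that the threshold is inclusive (CPM of at least 1) and that library sizes are the column sums of the unfiltered count table; the function names and the toy counts are hypothetical.

```python
def cpm(counts, library_sizes):
    """Convert a gene -> per-sample count table to counts per million."""
    return {
        gene: [1e6 * c / n for c, n in zip(row, library_sizes)]
        for gene, row in counts.items()
    }


def filter_cpm(counts, min_cpm=1.0, min_samples=3):
    """Keep genes with CPM >= min_cpm in at least min_samples samples."""
    n_samples = len(next(iter(counts.values())))
    # Library size of each sample = sum of all counts in that column.
    library_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    cpms = cpm(counts, library_sizes)
    return [
        gene for gene, row in cpms.items()
        if sum(v >= min_cpm for v in row) >= min_samples
    ]


# Toy table with four samples of 2,000,000 reads each (illustration only).
counts = {
    "geneA": [50, 60, 40, 55],
    "geneB": [0, 1, 0, 0],
    "geneC": [3, 0, 4, 5],
    "bulk": [1_999_947, 1_999_939, 1_999_956, 1_999_940],
}
kept = filter_cpm(counts)
print(kept)
```

Running the filter once on the full table, before splitting into positive and negative samples, is what guarantees that both groups are analyzed on the same gene set.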


Normalization

cpm1_3 boxplot

cpm1_3 density

Boxplots of the samples after normalization (counts per million (CPM) + weighted trimmed mean of M-values (TMM)), showing the distribution of the tag counts for all the samples. One boxplot is shown for each of the filtering approaches.


Number of Significant Genes

Gene

cpm1_3 Pos GLM

cpm1_3 Pos Voom

cpm1_3 Neg GLM

cpm1_3 Neg Voom

Transcript

cpm1_3 Pos GLM

cpm1_3 Pos Voom

cpm1_3 Neg GLM

cpm1_3 Neg Voom

Tables showing the number of significant genes obtained in each comparison using the different approaches. The tables show genes found significant after adjusting p-values for multiple hypothesis testing (FDR). Genes were termed significant if the adjusted p-value was below 0.05.

Analysis was performed on gene level as well as on transcript level after transcript quantification using Salmon.

GLM = edgeR using the GLM pipeline
Voom = edgeR/limma, using voom with sample weights to transform the data for subsequent analysis in limma
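The FDR cutoff used for the tables above can be illustrated as follows. The report only states that an FDR adjustment was applied; this sketch assumes the Benjamini-Hochberg procedure, which is the usual default in edgeR and limma. The gene names and p-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down; the running minimum keeps the
    # adjusted values monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values (illustration only).
pvals = {"geneA": 0.001, "geneB": 0.02, "geneC": 0.04, "geneD": 0.5}
adj = dict(zip(pvals, benjamini_hochberg(list(pvals.values()))))
significant = [gene for gene, q in adj.items() if q < 0.05]
print(significant)
```

Note that geneC has a raw p-value below 0.05 but is not called significant once the adjustment is applied.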


MA/Volcano plots

cpm1_3 voom MA

cpm1_3 voom Vol

MA and volcano plots of all the comparisons performed.

Each point represents a gene. The MA plot illustrates the magnitude of the expression change (y axis, log2 fold change) against abundance (x axis, log counts per million). Colors indicate the significantly regulated genes. The top 25 most significant genes are labeled with their gene symbols.

The volcano plot illustrates the magnitude of the expression change (x axis, log2 fold change) against its significance (y axis), shown as the negative log10 of the p-value (-log10(pvalue)), so higher values are more significant. Colors indicate the significantly regulated genes. The top 25 most significant genes are indicated by a black outline.
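The coordinates used in the two plots can be sketched as below. The result table, its field names and the helper functions are hypothetical, and ranking the labeled genes by raw p-value is an assumption (the report does not say whether raw or adjusted p-values were used for the top 25).

```python
import math


def volcano_xy(log2_fc, pvalue):
    """Volcano coordinates: x = log2 fold change, y = -log10(p-value)."""
    return log2_fc, -math.log10(pvalue)


def top_n_by_pvalue(results, n=25):
    """Names of the n genes with the smallest p-values."""
    return sorted(results, key=lambda gene: results[gene]["pvalue"])[:n]


# Hypothetical results: gene -> log2 fold change, log CPM and p-value.
results = {
    "geneA": {"log2_fc": 2.0, "log_cpm": 5.1, "pvalue": 1e-6},
    "geneB": {"log2_fc": -1.2, "log_cpm": 3.4, "pvalue": 1e-3},
    "geneC": {"log2_fc": 0.1, "log_cpm": 7.9, "pvalue": 0.5},
}
volcano = {g: volcano_xy(r["log2_fc"], r["pvalue"]) for g, r in results.items()}
# MA coordinates: x = abundance (log CPM), y = log2 fold change.
ma = {g: (r["log_cpm"], r["log2_fc"]) for g, r in results.items()}
labelled = top_n_by_pvalue(results, n=2)
print(labelled)
```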


Heatmaps

cpm1_3 voom

Heatmap and clustering of genes. The genes shown in the heatmap are those found significantly regulated in any of the comparisons. Each gene (row) is standardized (z-score) to mean = 0 and sd = 1 and then clustered by hierarchical clustering. Hierarchical clustering was also applied to the samples (columns).

voom = Heatmap shows all the genes found significant in any of the comparisons in the Voom analysis
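The row standardization applied before clustering can be sketched as follows. The sketch assumes the sample standard deviation (n - 1 denominator) and sets rows with no variation to zero; both choices, as well as the function name and the toy values, are assumptions. The clustering step itself is not shown.

```python
from statistics import mean, stdev


def zscore_rows(expr):
    """Standardize each gene (row) to mean 0 and standard deviation 1."""
    scaled = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = stdev(values)  # sample standard deviation (n - 1 denominator)
        if sd == 0:
            # A flat row carries no pattern; map it to all zeros.
            scaled[gene] = [0.0] * len(values)
        else:
            scaled[gene] = [(v - mu) / sd for v in values]
    return scaled


# Hypothetical expression values for three samples (illustration only).
expr = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [100.0, 200.0, 300.0],
    "geneC": [5.0, 5.0, 5.0],
}
scaled = zscore_rows(expr)
print(scaled)
```

After standardization geneA and geneB have identical rows, which is the point of the step: genes are clustered by the shape of their expression pattern rather than by their absolute expression level.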


Similarity

LogFC similarity cpm1_3 voom

Gene venn cpm1_3 voom

LogFC heatmap showing the correlation between the log2 fold changes of all the comparisons made.

The Venn diagram shows the overlap of significant genes between the different comparisons.
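Both similarity measures can be illustrated in a few lines. The report does not name the correlation coefficient, so Pearson correlation is assumed here; the comparison names are taken from the report, while the log2 fold changes and gene lists are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical log2 fold changes for the same three genes in two comparisons.
logfc = {
    "v2_v1_pos": [1.0, 2.0, 3.0],
    "v5_v4_pos": [2.0, 4.0, 6.0],
}
r = pearson(logfc["v2_v1_pos"], logfc["v5_v4_pos"])

# Hypothetical sets of significant genes; the Venn overlap is their intersection.
significant = {
    "v2_v1_pos": {"geneA", "geneB"},
    "v5_v4_pos": {"geneB", "geneC"},
}
shared = significant["v2_v1_pos"] & significant["v5_v4_pos"]
print(r, shared)
```

The correlation uses all genes regardless of significance, whereas the Venn diagram only considers genes that passed the significance cutoff, so the two views can disagree.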


Gene clusters

cpm1_3 voom

Boxplots of clusters of significantly regulated genes. The expression of each gene is averaged over all the samples in the same group. Then, the genes that were found regulated were clustered using self-organizing maps (SOM) into 6 distinct clusters. Each boxplot also shows the number of genes that fall in that cluster. This plot helps identify genes that have similar expression patterns across the different groups.
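The group averaging that precedes the clustering can be sketched as below. Only the averaging step is shown, not the SOM clustering; the group labels, gene names and values are hypothetical.

```python
from collections import defaultdict


def group_means(expr, sample_groups):
    """Average each gene's expression over the samples of each group."""
    members = defaultdict(list)
    for index, group in enumerate(sample_groups):
        members[group].append(index)
    return {
        gene: {
            group: sum(values[i] for i in indices) / len(indices)
            for group, indices in members.items()
        }
        for gene, values in expr.items()
    }


# Hypothetical design: two samples in each of two groups (illustration only).
sample_groups = ["v1", "v1", "v2", "v2"]
expr = {
    "geneA": [1.0, 3.0, 6.0, 8.0],
    "geneB": [5.0, 5.0, 2.0, 4.0],
}
averaged = group_means(expr, sample_groups)
print(averaged)
```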


GO umap plots

interactive

v2_v1_pos

mix
up
down

v5_v4_pos

mix
up
down

non-interactive

v2_v1_pos

v5_v4_pos

GO analysis on the significant genes obtained from each comparison.

Analysis was done on the upregulated and downregulated genes separately (up/down) and on all the significant genes together (mix).

The above plots are a umap summary of the top 150 GOs obtained in each analysis. The top 150 GOs from each analysis are combined and the semantic similarity between them is calculated. This similarity is used to perform umap dimensionality reduction. The umap is then clustered by hierarchical clustering and split into major clusters using dynamic tree cutting. The GOs are then separated again and plotted according to the analysis in which they were found enriched (up/down/mix). This way one can assess whether there are GO clusters or groups that are more common in, or even exclusive to, the up- or downregulated genes.

Words indicate the most common words found in the GO terms of each cluster. Mix refers to the GO analysis performed on all the significant genes together; in the excel files of the GO analysis, the analysis was also performed separately on the up- and downregulated genes.

In the interactive plots, hovering with the mouse pointer above a point shows the GO term for that point. It is also possible to zoom in to regions of the graph, such as individual clusters, among other functions.